Abstract

The present study explores the relationship between controlled productive
knowledge of collocations and L2 proficiency, the role of frequency in controlled
productive knowledge of collocations, and the quantifiability of controlled
productive collocational knowledge growth alongside L2 proficiency and word
frequency levels.

A proficiency measure and a productive collocation test modelled on Laufer 
and Nation (1999) were presented to Belgian and Burundian English majors. The 
results show that scores on both tests distinguish between proficiency levels and, 
furthermore, highly correlate. This suggests that controlled productive knowledge 
of collocations develops as proficiency increases, supporting earlier studies (Boers, 
Eyckmans, Kappel, Stengers, & Demecheleer, 2006; Bonk, 2001; Eyckmans, Boers, 
& Demecheleer, 2004; Gitsaki, 1999) that had established a relationship between 
collocational knowledge and L2 proficiency. The results also show that the more 
frequent the collocations, the better they are known, which highlights the crucial 
role played by frequency in knowing words (Nation & Beglar, 2007). Furthermore, 
the number of collocations added can be quantified: gains are moderate at
beginner and advanced levels and impressive at intermediate levels.
This supports and extends Laufer's (1998) and Zhong and Hirsh's (2009) findings
and lays the groundwork for teaching collocations, the number of which should
increase with proficiency level.

Keywords : quantifying, controlled productive knowledge, L2 proficiency, 
frequency levels 
Collocations have attracted increased research attention over the past
decades, and four fundamental questions have been examined. While two of
the questions, namely, the importance of collocations and the relevance of
explicitly teaching them, have been properly addressed and are no longer
contentious, the other two, namely, how to teach collocations and exactly
what collocations to teach, do not seem to have been properly addressed.

Over the past few years, a number of studies have demonstrated the
importance of collocations in an L2 context (see among others Cowie, 1998;
Granger & Meunier, 2008; Howarth, 1998; Nesselhauf, 2005; Pawley & Syder,
1983; Wray, 2002). Collocations have been found to characterise L2 proficiency,
with empirical evidence showing that collocational knowledge develops
alongside proficiency both receptively (Eyckmans, 2009; Gyllstad, 2007, 2009;
Keshavarz & Salimi, 2007) and productively (Bonk, 2001; Eyckmans, Boers, &
Demecheleer, 2004; Gitsaki, 1999; Nizonkiza, 2011a). Following the growing
importance attributed to collocations in research, several calls have been
made to teach vocabulary and foreign languages with special emphasis on
collocations. Many scholars have recommended teaching collocations
explicitly as a way forward in foreign language teaching (see among others Boers,
Eyckmans, Kappel, Stengers, & Demecheleer, 2006; Lewis, 1993, 1997, 2000;
Martynska, 2004; Nattinger & DeCarrico, 1992). However, neither how to teach
collocations nor exactly what to teach has been properly addressed so far.

Recently, pedagogical experiments have been conducted in order to
address the 'how' of teaching collocations. Although no common teaching method
has been adopted so far, different studies point to the general observation
that raising learners' awareness of the phenomenon of collocations
constitutes the best strategy for teaching them (see among others
Barfield, 2009; Boers et al., 2006; Boers & Lindstromberg, 2008; Coxhead,
2008; Jiang, 2009; Peters, 2009; Wray & Fitzpatrick, 2008; Ying & O'Neill,
2009). Two approaches have been tried, namely, the awareness-raising and
attention-drawing techniques. They are basically similar in nature and find
their theoretical ground in Nation's (2001) three psychological conditions,
that is, noticing, retrieving, and generation (Coxhead, 2008).

The awareness-raising approach was trialled in different contexts by
means of different tasks, and the studies point to the same observation:
raising L2 learners' awareness of collocations is effective. It helps learners
overcome the fundamental problem they have when learning collocations. L2
learners generally attend to individual words, breaking the collocation down
into separate units, which impinges on their fluency as they have to reconstruct
the words in appropriate pairings at the time of use (Barfield, 2009; Wray,
2002). This approach, which puts awareness-raising activities at the forefront
of teaching collocations, was felt to be an option to fill the gap in teaching
collocations/multiword units. The attention-drawing technique, also referred
to as 'phrase-noticing', is an approach inspired by Lewis's Lexical
Approach that has been put to the test by Boers et al. (2006), among others.
Boers et al. (2006) measured the gains in participants' oral proficiency that
resulted from the phrase-noticing activities in which they had taken part. The
authors came to the conclusion that the phrase-noticing approach helps
students to recognize chunks/collocations that they are able to use in real
conversations, therefore improving their oral proficiency.

Equally important is the question of 'what' collocations to teach, and it is
far from being properly addressed. Collocation dictionaries, which provide
common collocations, are helpful for teachers and L2 learners and can be relied
on in this respect. For instance, an assessment of The BBI Combinatory
Dictionary of English: A Guide to Word Combinations (Benson, Benson, & Ilson,
2010) and the Macmillan Collocations Dictionary for Learners of English (2010)
shows that they are indeed important for learners and teachers. According to
Ptaszynski (2011), the BBI dictionary presents data useful to a heterogeneous
audience (students, teachers, translators, writers, etc.) and can therefore be
referred to as a one-size-fits-all dictionary. The dictionary provides its users with
useful and detailed information particularly important for learners of English
who want to improve their productive skills. However, very little is known about
the proficiency levels of the learners, the nature of the text they want to write,
and their mother tongue, which makes it hard to believe that the data
presented in the dictionary and its accessibility match the profile and needs of
its prospective users. Therefore, the dictionary "... remains a dictionary of a
linguist, by a linguist, and for a linguist" (Ptaszynski, 2011, p. 151).

Coffey (2011), who assessed the Macmillan Collocations Dictionary
for Learners of English (2010), finds it well planned as a pedagogical dictionary.
It offers learners ways to find relevant collocations easily, for instance, by
grouping collocates in semantic sets with their meanings provided. However,
the dictionary does not offer an overview of the collocations on which it
focuses and does not draw learners' attention sufficiently to collocational
patterns, such as verb and adjective headwords that lead to noun collocates,
which learners may otherwise overlook.

Wible, Kuo, Chen, Tsao, and Hung's (2006) tool, namely, the
COLLOCATOR, which basically functions in the same way as collocation
dictionaries, was designed in an attempt to help teachers/learners find out which
collocations to teach/learn. The COLLOCATOR is a web-based tool which, once
activated, detects the multiword expressions from the British National
Corpus (BNC) that occur on the webpage a user is viewing. They are highlighted
and presented in pairs. This tool represents a significant step in extracting and
determining which collocations are important, thus helping both learners and
teachers to focus on common collocations. However, the large number of
collocations that the COLLOCATOR detects may be confusing, especially for learners
who need guidance as to which collocations to attend to (Wible, 2008). Like
dictionaries of collocations,¹ the COLLOCATOR does not specify which
collocations to teach at which level of proficiency, an issue that needs exploring.

In view of the above, I believe that tracking collocational knowledge
growth as proficiency develops and across word frequency bands may be one
way to address this question. The present study has been initiated in this light
and builds on the established relationship between collocations and L2
proficiency in order to study the measurability of productive collocational
knowledge growth. It replicates Nizonkiza (2011b) and pursues the same
objectives, namely, (a) the extent to which controlled productive collocational
knowledge increases as overall L2 proficiency develops and (b) the extent to
which controlled productive collocational knowledge of L2 learners develops
according to word frequency levels, to which it adds (c) the quantifiability
of collocations gained according to proficiency and word frequency levels.

The first aim of the study was motivated by research findings according
to which a strong relationship exists between receptive collocational competence
and L2 proficiency (Gyllstad, 2007, 2009; Keshavarz & Salimi, 2007;
Nizonkiza, 2011a). The pertinent question here is whether or not the same
holds for productive knowledge of collocations. The assumption is that the
same relationship should logically be found for productive knowledge of
collocations (cf. Bonk, 2001; Gitsaki, 1999), or controlled productive knowledge in
the present case. Empirical evidence suggests that productive knowledge
always lags behind receptive knowledge (Jaen, 2007; Laufer, 1998; Laufer &
Paribakht, 1998) and that learning vocabulary in general, and passing from
receptive to productive knowledge in particular, is not a linear activity (Laufer,
1998; Meara, 1996; Melka, 1997; Read, 2004). I therefore assume that
controlled productive knowledge of collocations increases with L2 proficiency, but
the gain from one level of proficiency to another is not always significant, a
hypothesis that was confirmed in the original study. However, the sample
population consisted of three levels of proficiency at the beginner and low
intermediate levels, and Nizonkiza (2011b) suggested replicating the study in
order to include more levels, which is the raison d'être of the present study.

The second issue explored is the extent to which controlled productive
knowledge of collocations is influenced by word frequency, as it has been
demonstrated that the more frequent words are better known at the vocabulary size
level (Beglar, 2010; Nation, 1983; Nation, 1990; Nation & Beglar, 2007) and for
receptive collocational competence (Gyllstad, 2007; Nizonkiza, 2011a). It was
shown in the original study that controlled productive collocational competence
of L2 learners increased from the less frequent to the more frequent word levels,
which will be tested in the present study with more levels of proficiency.


1 For a comprehensive overview of collocation dictionaries, I refer the reader to Handl (2009).

A twofold question, not tackled in the original study, has been added. As
controlled productive knowledge of collocations grows with proficiency and
word frequency levels, it makes sense to reflect on the extent to which we
can quantify the collocations gained (a) from one level of proficiency to another
and (b) from one word frequency level to the next. In other words, the following
question will be answered: If collocational knowledge develops with proficiency
and word frequency levels, is the knowledge acquired quantifiable?

In brief, the present study will test the following assumptions: 

1. Controlled productive knowledge of collocations grows with proficiency,
but the gain from one level of proficiency to another is not always
significant.

2. Controlled productive collocational competence of L2 learners
increases from less frequent to more frequent word levels.

3. As collocational knowledge develops with proficiency and word
frequency levels, the knowledge added can be quantified and, following
the nonlinear nature of vocabulary growth in general (cf. Laufer, 1998;
Meara, 1996; Melka, 1997; Read, 2004), the gains are dependent on
both proficiency and word frequency levels.

Measuring Vocabulary Growth

Research in vocabulary has, among other things, tried to measure
vocabulary growth. Nation's (1990) Vocabulary Levels Test (VLT), which requires
"learners to match target words to their synonyms or definitions" (Read, 2000,
p. 171), is the most widely used matching test for this purpose (Ishii & Schmitt,
2009; Read, 2007). It involves word-definition matching in either direction,
namely, word-to-definition or definition-to-word matching.

Findings from measuring vocabulary size have considerable implications
for both teaching and research. They enable syllabus and material developers
to (a) design what may be an optimal syllabus, namely, one that brings in
optimal conditions for the learning/teaching activities to succeed (Laufer,
1998; Schmitt, Schmitt, & Clapham, 2001); and (b) decide on how many words to
teach in a unit and how to teach them (Read, 2000). They enable researchers to
(a) quantify the threshold instruction for comprehending written materials
(Laufer, 1998) and (b) use the materials generated for studying the relationship
between vocabulary growth and the learning conditions (Laufer, 1998). For a
complete overview of vocabulary size and text coverage, I refer the reader to
Nation and Waring (1997) and Nation (2006).

Measuring vocabulary growth has been extended to productive
knowledge. Laufer and Nation (1999) adapted the VLT and came up with its
active version, which measures controlled productive ability. Each
test item is presented in a sentential context with the first two letters
provided, and the test-takers' role is to fill in the missing letters (Laufer &
Nation, 1999). Whenever the first two letters could start more than one word,
a third letter is added in order to disambiguate the cue.

Laufer (1998) used this test in order to compare three types of
vocabulary knowledge, namely, receptive,² free productive, and controlled
productive, after one year of instruction. The study involved two groups of Israeli
English learners with six and seven years of exposure to the language. The
study examined the gains in these types of knowledge, how they are related to
one another, and the changes that occur in these relationships. Laufer (1998)
observed that both receptive knowledge and controlled productive vocabulary
progressed well, but with more progress at the receptive level, while free
productive vocabulary did not progress at all. The receptive vocabulary size was
found to be larger than controlled productive size, with a larger gap in the
more advanced group.

Zhong and Hirsh (2009) used an adapted version of the controlled
productive test to examine the growth of controlled productive knowledge and
compare it to receptive knowledge. The study involved high school students in
China. The test presented to participants consisted of items selected from the
2000-word, 3000-word, 5000-word levels and the Academic Word List (AWL).
It was administered in pre- (third week of class) and post-experimental
conditions (10 weeks later). As indicated by the findings, both receptive and
controlled productive vocabulary knowledge grew significantly at some word
levels after a 10-week course. Overall, greater growth was observed in
controlled productive knowledge than in receptive knowledge, but receptive
knowledge was larger than controlled productive knowledge at all the levels.
However, the gap between the two lessened after 10 weeks of study. My
study is in line with Zhong and Hirsh's (2009) study and will measure
controlled productive collocational knowledge growth.


2 I adopted the terms mostly used in the literature although Laufer (1998) used passive,
active, and controlled active.
Quantifying Controlled Productive Collocational Competence
Across Proficiency Levels


Sample Population 

English majors from a university in Belgium and one in Burundi participated
in the study. The first data set was collected from English majors in Burundi
and the results are reported in Nizonkiza (2011b). Participants from Burundi
were aged between 20 and 26. They speak Kirundi, their mother tongue; French,
a language of wider communication in Burundi and used in official matters;
and Swahili (for a few of them), a lingua franca of East Africa. They were
selected from year one (n = 36), year three (n = 44), and year four³ (n = 36) using
the systematic random sampling technique⁴ (cf. Babbie, 1990; Dagnelie, 1992).
Participants sat the tests on three successive days in the following order: year
four, year three, and year one. They were invited by their lecturers and sat the
tests in two sessions (TOEFL first and the collocation test afterwards) with a
short pause in between (30 minutes). TOEFL was administered and marked
following the Educational Testing Service's instructions. As regards the
collocation test, students were required to follow the instructions and an example
was provided. The test lasted 10 to 30 minutes and students were awarded 1
point per correct answer. TOEFL scores were used in order to determine the
proficiency levels of the participants. Their scores ranged between 310 and
506 and the mean scores of the groups are 335.17 in Level 1; 386.40 in Level 2;
and 444.63 in Level 3 (the maximum paper-based TOEFL score is 677); the levels
were confirmed as different by a post-hoc analysis test (Scheffe).

Given the low level of proficiency of participants (from beginner to
intermediate), Nizonkiza (2011b) recommended replicating the study in order to
include more English majors and therefore get more levels of proficiency.
Subsequently, 100 Belgian English majors, almost at the end of their first year
at the university (end of April), with Dutch as their L1, aged between 18 and 20,
volunteered to participate. The students were invited through their lecturer in
a proficiency course. Those who attended the following class a week later
participated. I was allowed in 20 minutes before the class ended and presented
the test, which lasted 5 to 15 minutes.


3 In Burundi, the bachelor degree is organised in four years. Year two could not be
included in the study because the data was collected towards the end of the year and
second year students who had finished their exams were away.

4 According to the technique, every nth subject is selected from a random starting point. 
The Belgian students had sat an old paper-based version of TOEFL for
other purposes and their level of proficiency was quite high with scores
ranging from 493 to 657. The Burundian and the Belgian data were encoded and
put in the same data set. However, 30 of the Belgian students who either did
not finish the collocation test or who did not have any TOEFL scores were
excluded from the analysis. After merging the two data sets, participants were
allocated to proficiency levels on the basis of their TOEFL scores. Bearing in
mind Bouma's (1984) suggestion that a group should consist of at least 30
candidates for statistical reasons, five levels of proficiency were distinguished.
Level 1 (n = 33) scored between 310 and 356; Level 2 (n = 42) scored between
360 and 410; Level 3 (n = 40) scored between 413 and 493; Level 4 (n = 40)
scored between 503 and 577; while Level 5 (n = 30) scored between 580 and
657. A Scheffe analysis test was run and confirmed that the different
proficiency levels belonged to different groups.

The Test Battery 

A controlled productive test of collocations (see Appendix B) was
developed and presented to participants. Frequency of words and their syntactic
nature guided the selection and only verb-noun (V + N) combinations were
retained. The V + N combinations constitute the collocations investigated in
this study for the three reasons explained in Gyllstad (2007), namely, (a) they
constitute frequent occurrences, (b) they are very difficult for L2 learners, and
(c) they contain the most important information for communication. A
fourth reason is that when we express ourselves, we do not think of the verb
first. We tend to start with the noun, which stands for the action we want to
perform, and then think of a verb that goes with it, which stands for how to
perform the action (Oxford Collocations Dictionary for Students of English, 2002).

The target words were selected from Nation's (2006) word frequency
count, a database of word families based on the BNC and organised in
frequency bands of 1000 words each. Words were selected from the 2000-word,
3000-word, and 5000-word levels (cf. Nation, 1983; Nation, 1990; Schmitt et al.,
2001), and from Coxhead's (2000) AWL, which consists of frequent words in
academic contexts, but which do not appear in the first 5000 words. The
10000-word level, another level considered by Nation and colleagues, was
excluded because it consists of words deemed too infrequent to allow us to learn
much from scores at this level, given the proficiency level of the initial sample
population (Burundians) of the study (cf. Nizonkiza, 2011b). Ten words (cf.
Nation & Beglar, 2007) were selected from each of the word frequency bands,
making a total of 40 target items. The target words, known as nodes, had to be
nouns and were selected using systematic random sampling (Babbie, 1990;
Dagnelie, 1992), according to which each nth (100th in this case) word is
selected from a random starting point. Whenever the 100th word was not a noun,
the next noun was selected instead.
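
The sampling rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `systematic_sample`, the `is_noun` predicate, and the toy word list are hypothetical, and the text does not say how the random starting point was drawn or how part of speech was checked.

```python
import random


def systematic_sample(words, is_noun, step=100, k=10, seed=None):
    """Take every `step`-th word from a random start; if that word is not
    a noun, move forward to the next noun (hypothetical helper)."""
    rng = random.Random(seed)
    start = rng.randrange(step)          # random starting point within the first step
    picked = []
    for pos in range(start, len(words), step):
        i = pos
        while i < len(words) and not is_noun(words[i]):
            i += 1                       # skip ahead to the next noun
        if i < len(words) and words[i] not in picked:
            picked.append(words[i])
        if len(picked) == k:
            break
    return picked


# Toy frequency band of 1000 placeholder "words"; every third one counts as a noun.
band = [f"w{i}" for i in range(1000)]
nodes = systematic_sample(band, lambda w: int(w[1:]) % 3 == 0, seed=1)
print(nodes)
```

With a band of 1000 words and a step of 100, the procedure yields ten nodes per band, which matches the ten target words per frequency level reported here.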

The next step was to select their collocates (verbs in the V + N
combination) from the Oxford Collocations Dictionary for Students of English (2002).
The frequency of collocates was also controlled and they had to be of higher
frequency level than the nodes (cf. Gyllstad, 2007) or similar frequency, in case
no collocate of higher frequency was found. The whole selection procedure is
summarised through the following steps:

• A noun was selected from Nation's (2006) word frequency count. 

• All the verbs collocating with it in the V + N combination (from the 
Oxford Collocations Dictionary for Students of English, 2002) were listed. 

• Their frequency level was checked in Nation's (2006) word frequency count. 

• The verbs of similar frequency level, if not possible to find higher level, 
were retained. 

• An online collocation sampler,⁵ which gives different collocates of the
node, and information on how many times they appear in the Bank of
English, how many times they co-occur with the node, and how
significantly they do so, was run with the most significant collocate
considered for selection.

For instance, the collocates of the word accuracy include of, with, be,
correct, checked, ensure, lack, predict, fly, and so on, up to the 100th
co-occurring word. Improve, increase, check, confirm, test, ensure, doubt, and
question are presented in the Oxford Collocations Dictionary for Students of
English (2002) as the relevant V + N combinations. I selected ensure because it
belongs to the 1000-word level, being thus more frequent than accuracy,
which belongs to the 2000-word level, and collocates with it more significantly,
with a mutual information (MI) score of 2.3, higher than the other verbs of the
V + N combination. The collocations were presented in a sentential context with
the verb to the left of the noun.
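
As a rough illustration of this selection rule, the sketch below computes a mutual information score and picks the strongest eligible collocate. It assumes the definition of MI commonly used in corpus linguistics, log2 of observed over expected co-occurrence; the text does not state how the collocation sampler computes its score. The function names and every number except the MI of 2.3 and the frequency bands reported for ensure and accuracy are invented for the example.

```python
import math


def mutual_information(pair_freq, node_freq, collocate_freq, corpus_size):
    """MI = log2(observed / expected), where expected is the co-occurrence
    count predicted if node and collocate were independent (assumed definition)."""
    expected = node_freq * collocate_freq / corpus_size
    return math.log2(pair_freq / expected)


def pick_collocate(candidates, node_band):
    """candidates: (verb, frequency_band, mi) tuples; a lower band means a more
    frequent word. Prefer verbs more frequent than the node, fall back to verbs
    of the same band, then take the highest MI."""
    eligible = [c for c in candidates if c[1] < node_band]
    if not eligible:
        eligible = [c for c in candidates if c[1] == node_band]
    if not eligible:
        return None
    return max(eligible, key=lambda c: c[2])[0]


# Only the 'ensure' row (1000-word level, MI 2.3) comes from the text;
# the other rows are hypothetical placeholders.
candidates = [("ensure", 1000, 2.3), ("check", 1000, 1.8), ("confirm", 2000, 2.1)]
print(pick_collocate(candidates, node_band=2000))
```

Encoding the frequency requirement as a filter and the significance requirement as a maximum keeps the two criteria separate, which mirrors the order of the steps listed above.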

As regards the format, the test was modelled on Laufer and Nation's
(1999) Vocabulary Levels Test active version, which presents words in a
sentential context. In the present test, once the collocates were identified,
authentic illustrations were selected from the Oxford Collocations Dictionary for
Students of English (2002), which was chosen because it was designed as a
learning tool, compiled on the basis of the BNC (frequency of collocations was
checked from the corpus), and containing sentences from the BNC, or with
minor modifications aimed at making them more accessible to learners,
without altering the meaning of the collocations.


5 The collocation sampler is available online at: http://www.collins.co.uk/Corpus/CorpusSearch.aspx

Laufer and Nation's (1999) original test was developed from Nation's
(1990) Vocabulary Levels Test and was designed to test the controlled
productive ability, which refers to

the ability to use a word when compelled to do so by a teacher or researcher,
whether in an unconstrained context such as a sentence writing task, or in a
constrained context such as a fill in task where a sentence context is provided and the
missing target word has to be supplied. (p. 37)

The verb was deleted (in each sentence) with the first two letters provided and
underlined in order to avoid wildly varying answers (cf. Laufer & Nation, 1999).
Test-takers were instructed to complete the underlined word, and an
example was provided so as to ensure transparency (see the example below).

Instruction: Complete the underlined words in the sentences below.

Example: She is conducting campaigns to at______ new clients.

She is conducting campaigns to attract new clients.
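
The item format can be illustrated with a short Python sketch. The helpers `make_item` and `score` are hypothetical names introduced for the example; the fixed-length blank and the case-insensitive, whitespace-trimmed matching are assumptions, while the two-letter cue and the one-point-per-correct-answer marking follow the description in the text.

```python
import re


def make_item(sentence, verb, cue_len=2, blank="______"):
    """Replace the first whole-word occurrence of `verb` with its first
    `cue_len` letters followed by a blank (hypothetical helper)."""
    pattern = rf"\b{re.escape(verb)}\b"
    if not re.search(pattern, sentence):
        raise ValueError(f"{verb!r} not found in sentence")
    return re.sub(pattern, verb[:cue_len] + blank, sentence, count=1)


def score(responses, keys):
    """Award 1 point per response that matches the key (assumed to be
    compared case-insensitively after trimming whitespace)."""
    return sum(r.strip().lower() == k.lower() for r, k in zip(responses, keys))


print(make_item("She is conducting campaigns to attract new clients.", "attract"))
print(score(["attract", "ensur"], ["attract", "ensure"]))
```

Passing `cue_len=3` would reproduce the disambiguation step used in the original active Vocabulary Levels Test when two letters could start more than one word.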


Results 

Controlled productive knowledge of collocations grows with proficiency.
The first aim of the study was to measure the extent to which controlled
productive collocational knowledge grows with proficiency. As the test-takers
were ranked and grouped in five levels of proficiency according to their TOEFL
scores, the collocation test scores were analysed in this light.⁶ The means and
standard deviations are presented in Table 1 and show that the same levels are
reflected in the collocation test scores, with much variability at the beginner
levels. A one-way analysis of variance (ANOVA) indicates that the means
distinguish significantly between the levels (significance level of 0.000,
2-tailed).

Post-hoc comparisons using the Scheffe test were conducted. They
indicate that the mean differences between the levels (Table 2, Mean
difference column), judged by their related significance levels (Table 2, Sig.
column), are statistically significant except for the difference between Levels
1 and 2. What we learn from this finding is that Levels 1 and 2 actually belong
to one group.


6 Although not reported here, reliability of items was measured. The Cronbach's alpha,
which is .90, indicates that the test is internally consistent although a few items (5), whose
corrected item-total correlation is below Ebel's (1979) scale cut-off point (.19), need revising.
Table 1 Mean scores on Collocation Test

Level    N     Mean     SD
1        33    20.85    4.18
2        42    21.02    5.06
3        40    25.85    4.37
4        40    33.38    2.46
5        30    36.50    1.99


Table 2 Groups set by Scheffe test

(I) level   (J) level   Mean difference (I-J)   Std. error   Sig.
1           2              -.175                .901         1.000
            3            -5.002*                .911         .000
            4           -12.527*                .911         .000
            5           -15.652*                .978         .000
2           1               .175                .901         1.000
            3            -4.826*                .856         .000
            4           -12.351*                .856         .000
            5           -15.476*                .926         .000
3           1             5.002*                .911         .000
            2             4.826*                .856         .000
            4            -7.525*                .867         .000
            5           -10.650*                .936         .000
4           1            12.527*                .911         .000
            2            12.351*                .856         .000
            3             7.525*                .867         .000
            5            -3.125*                .936         .028
5           1            15.652*                .978         .000
            2            15.476*                .926         .000
            3            10.650*                .936         .000
            4             3.125*                .936         .028

*. The mean difference is significant at the 0.05 level.

The predictive relationship between collocational knowledge and L2
proficiency was studied further by fitting a regression line (Figure 1). The
formula of the regression line can be expressed in the following terms:
Y = bX + a (cf. Salkind, 2011), where Y = proficiency, expressed by TOEFL score;
b = the slope or direction of the line; X = the score used as the predictor, the
collocation test in this case; and a = the point at which the line crosses the
y-axis. Using the coefficients from Table 3, the formula can be numerically
written as follows: Y = 11.215X + 152.826. We can therefore use this equation to
predict the level of proficiency, namely, TOEFL score (Y), given any score in the
collocation test (X). As Figure 1 shows, the regression line has a positive slope,
which reflects a positive correlation (.837, Appendix B) between knowing
collocations and level of proficiency. It appears from this finding that the more
collocations learners know, the more proficient they are, suggesting that
controlled productive knowledge of collocations grows alongside proficiency
level. However, as can be seen from Table 2, the post-hoc test put proficiency
Levels 1 and 2 in the same group, implying that knowledge gained from one
proficiency level to another is not always significant. The findings above confirm
the first hypothesis of the study, which says that controlled productive knowledge
of collocations grows with proficiency, but the gain from one level of proficiency
to another is not always significant.
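
To make the use of the regression equation concrete, the following minimal Python sketch applies the coefficients reported in Table 3. The function name `predict_toefl` and the range check on the collocation score (0 to 40, the number of test items) are assumptions added for illustration; the predictions are point estimates from the fitted line and carry the usual regression error.

```python
SLOPE = 11.215       # b, unstandardized coefficient of the collocation score (Table 3)
INTERCEPT = 152.826  # a, the constant (Table 3)
MAX_SCORE = 40       # the collocation test has 40 items


def predict_toefl(collocation_score):
    """Predicted TOEFL score Y = bX + a for a collocation test score X."""
    if not 0 <= collocation_score <= MAX_SCORE:
        raise ValueError("collocation score must be between 0 and 40")
    return SLOPE * collocation_score + INTERCEPT


for x in (20, 30, 40):
    print(x, round(predict_toefl(x), 3))
```

For example, a collocation score of 20 gives a predicted TOEFL score of about 377.1, and the maximum score of 40 gives about 601.4.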


Table 3 Collocation-proficiency regression model

              Unstandardized coefficients   Standardized coefficients
Model         B          Std. error         Beta                        t        Sig.
(Constant)    152.826    15.267                                         10.010   .000
COLLOTTOT     11.215     .542               .837                        20.698   .000

Note: Dependent variable: TOEFL (proficiency)


Figure 1 Correlative links between TOEFL and Productive Collocation Test
(plot title: "Collo predicts proficiency")
Word frequency affects controlled productive knowledge of
collocations. The second issue addressed in the study is the extent to which
controlled productive knowledge of collocations of L2 learners develops according
to word frequency levels. As words were selected from different word
frequency levels, a one-way repeated-measures ANOVA, involving the word
frequency levels the collocation test consists of, was performed and the results
are presented in Table 1, Appendix A. They show that the overall difference in
mean scores is statistically significant, as shown by the Sphericity Assumed
Correction Test and its associated significance level of 0.000.

However, the omnibus test does not show where the significant differences
occur, even though the mean scores indicate that the higher the frequency
band, the higher the score. The mean is 7.65 at the 2000-word level; it drops
slightly to 7.42 at the 3000-word level and to 6.62 at the AWL, and more
sharply to 5.53 at the 5000-word level. The data were therefore analysed
further using the Bonferroni post-hoc test, a multiple-comparison test that
shows where the differences are significant (Table 2, Appendix A). From the
second column (Mean difference I-J), we can see that the differences are
statistically significant between all the word frequency levels except
between the 2000-word and 3000-word levels (see footnote 7). This confirms
the second hypothesis of the study, according to which controlled productive
knowledge of collocations of L2 learners increases from the less frequent to
the more frequent word levels.
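To make the multiple-comparison step concrete, the Python sketch below enumerates the pairwise comparisons among the four frequency levels and recomputes the mean differences from the means reported in the text. Dividing the significance threshold by the number of comparisons is one common way of applying the Bonferroni correction and is shown here as an assumption about the procedure, not as a description of the software used in the study.

```python
from itertools import combinations

# Mean collocation scores per word frequency level, as reported in the text.
means = {"2000-word": 7.65, "3000-word": 7.42, "AWL": 6.62, "5000-word": 5.53}

# Every unordered pair of levels is compared once.
pairs = list(combinations(means, 2))

# Bonferroni: divide the nominal alpha by the number of comparisons.
alpha = 0.05
bonferroni_alpha = alpha / len(pairs)

# Pairwise mean differences, rounded to two decimals like the reported means.
differences = {(a, b): round(means[a] - means[b], 2) for a, b in pairs}

print(len(pairs), round(bonferroni_alpha, 4))
for (a, b), diff in differences.items():
    print(f"{a} vs {b}: {diff:+.2f}")
```

The recomputed differences agree with the Mean difference (I-J) column of Table 2 in Appendix A up to rounding, since the table is based on unrounded means.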

Quantifying collocation gains across proficiency and word frequency
levels. The third aim of the study was to examine whether collocations gained
across proficiency and word frequency levels can be quantified. In order to
quantify additions according to proficiency, means from Table 1 were used.
The last column shows that means are higher at higher levels of proficiency
and range between 20.85 and 36.35. The mean differences between two
successive proficiency levels, which are 0.17 from Level 1 to Level 2, 4.83
from Level 2 to Level 3, 7.53 from Level 3 to Level 4, and 2.97 from Level 4
to Level 5, can be read as estimates of the collocations learners can add
from one level of proficiency to another. Overall, additions tend to be small
at the beginner (between Levels 1 and 2) and advanced (between Levels 4 and
5) levels, while impressive gains are observed at the intermediate (between
Levels 2 and 3 and especially between Levels 3 and 4) levels, clearly
indicating that the additions depend on proficiency levels.
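The step-by-step gains can be checked against the overall range of means with a few lines of Python. The sketch below uses only the figures quoted in the text; the dictionary keys and variable names are illustrative assumptions.

```python
# Gains between successive proficiency levels, as reported in the text.
gains = {"1->2": 0.17, "2->3": 4.83, "3->4": 7.53, "4->5": 2.97}

# Lowest and highest proficiency-level means, as reported in the text.
lowest_mean, highest_mean = 20.85, 36.35

# The successive gains should add up to the full range of the means.
total_gain = sum(gains.values())

# Step with the largest gain, and the share of the total gained at
# the intermediate steps (Level 2 to 3 and Level 3 to 4).
largest_step = max(gains, key=gains.get)
intermediate_share = (gains["2->3"] + gains["3->4"]) / total_gain

print(round(total_gain, 2), round(highest_mean - lowest_mean, 2))
print(largest_step, round(intermediate_share, 2))
```

On these figures, about four fifths of the total gain between Level 1 and Level 5 falls in the two intermediate steps.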


7 In Table 2 in Appendix A, 1 stands for 2000-word, 2 for 3000-word, 3 for 5000-word, and 
4 for AWL. 
In order to quantify additions according to word frequency levels, means
were computed and are presented in Table 4.

Table 4 Collocation means across word frequency levels

Level        N      Minimum   Maximum   Mean   SD
2000-word    185    2         10        7.65   2.01
3000-word    185    1         10        7.42   2.09
AWL          185    1         10        6.62   2.14
5000-word    185    1         10        5.53   2.14


The mean scores in Table 4 stand for estimates of collocations likely to be
known at each word frequency level. The standard deviations differ little
across the frequency bands, although they are slightly lower at higher word
frequency levels, which are likely to be better known. According to N. Schmitt
(personal communication, 2003, as cited in Xing & Fulcher, 2007), the expected
score at an acquired word frequency level should be 80%, which means 8 out
of 10 in the present case (see footnote 8). The scores were weighed against
this criterion, which shows that learners need to add at least 0.35 at the
2000-word level, 0.58 at the 3000-word level, 1.38 at the AWL, and 2.47 at the
5000-word level. Clearly, it will take much more time to bring learners up to
the criterion at the 5000-word level than at the 2000-word level, where they
need 2.47 and 0.35, respectively. In other words, more time is needed in order
to take learners to a less frequent word band than to a more frequent one,
which implies that the number of collocations still to be added depends on
the frequency band, with more to be added at the less frequent bands.
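The distance between each mean and the 80% criterion is a simple subtraction, sketched below in Python from the means in Table 4. The constant and variable names are illustrative assumptions; the criterion itself is the one attributed to Schmitt in the text.

```python
# Ten items per frequency band, one point each; a band counts as
# acquired at 80%, i.e. 8 out of 10.
ITEMS_PER_BAND = 10
CRITERION = 0.80 * ITEMS_PER_BAND

# Mean scores per word frequency level, as reported in Table 4.
means = {"2000-word": 7.65, "3000-word": 7.42, "AWL": 6.62, "5000-word": 5.53}

# Shortfall: how many points the average learner still needs to reach
# the criterion at each level, rounded to two decimals.
shortfall = {level: round(CRITERION - mean, 2) for level, mean in means.items()}

for level, gap in shortfall.items():
    print(f"{level}: {gap:.2f} below the criterion")
```

Sorting the levels by shortfall reproduces the frequency ordering: the more frequent the band, the smaller the remaining gap.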

The two findings above allow me to confirm the third hypothesis, according
to which collocational knowledge added can be quantified and the gains
depend on both proficiency and word frequency levels.

Discussion 

The present study attempts to measure controlled productive
knowledge of collocations, operationalised through verb-noun combinations,
across proficiency and word frequency levels. In order to achieve the three
aims of the study, a proficiency test (TOEFL) and a collocation test were
presented to Belgian and Burundian English majors. The first aim pursued in
the study is the extent to which controlled productive knowledge of
collocations increases as proficiency develops. The proficiency measure used
to allocate participants to proficiency groups distinguishes between five
different groups,


8 Ten words were selected from each word frequency band and students were awarded 1 
point per correct answer and 0 points for a wrong answer. 
the same groups that are also reflected in the collocation test with significant
differences. However, the differences between the levels are consistently
more moderate at beginning and advanced levels, namely, from Level 1 to
Level 2 and from Level 4 to Level 5 (cf. Table 1), which empirically supports
Laufer's (1998) observation that vocabulary growth is slow at beginning levels
and gains momentum as proficiency increases. It complements Laufer's study
by showing that a plateau in the growth of collocations can be reached,
confirming Li and Schmitt (2009), who did not find any progress in terms of
collocation production among advanced Chinese learners of English. The study
also supports the nonlinear nature of vocabulary growth (Laufer, 1998; Meara,
1996; Melka, 1997; Read, 2004). For these scholars, word acquisition is not a
steady progression along a continuum and has shifting and transition zones,
especially from receptive to productive knowledge: two levels of word
knowledge which I believe characterise proficiency levels of learners.

While the present study confirms Nizonkiza's (2011b) findings, it also
presents empirical evidence for his assumption that collocation growth is slow
at low levels, gains momentum at intermediate levels, and stabilises and even
reaches a plateau at very advanced levels, an assumption formulated on the
basis of Laufer's (1998) and Li and Schmitt's (2009) observations above.
However, a reservation should be made as regards the predictive power of
controlled productive knowledge of collocations over L2 proficiency. While
controlled productive knowledge of collocations is a predictor of overall
proficiency, it may not be a reliable one at beginner levels.

The second question addressed in the study is the role of word frequency
in controlled productive knowledge of collocations. As the test items were
selected from different word frequency levels, the test scores were analysed
accordingly, and the results indicate that mean scores differ significantly
between every pair of word frequency levels, except between the 2000-word and
3000-word levels. The presence of upper intermediate and advanced learners
among the participants may account for the non-significant difference between
scores at these two word frequency levels, which lie at the borderline of the
frequency cut-off point. According to Schmitt et al. (2001), the cut-off point
of frequency is the 2000-word level. Results also indicate that the higher the
frequency band is, the higher the score will be, which highlights the
fundamental role played by frequency in knowing words (Beglar, 2010; Nation,
1983; Nation, 1990; Nation & Beglar, 2007). This finding extends the role
played by frequency in word knowledge, which has empirical support at the
vocabulary size level, to controlled productive knowledge of collocations.

The quantifiability of collocations gained across proficiency and word
frequency levels is the third aim of the study. Estimates of collocations that
can be added according to proficiency are represented by the mean score
differences between each two successive proficiency levels. They tend to be
smaller at beginner and advanced levels than at intermediate levels, clearly
demonstrating that collocation gains are dependent on proficiency level.

As regards collocations gained according to word frequency levels, mean
scores at each word frequency band were weighed against Schmitt's cut-off
point for an acquired word frequency band. The differences between the actual
scores and the cut-off point were found to be minor at higher word frequency
bands, gradually becoming more substantial at less frequent word bands. This
implies that gains depend on word frequency band and that it logically takes
less time to take a learner from the 2000-word level to the 3000-word level
than from the 3000-word level to the 4000-word level, for instance. The
teaching implication of the above findings is that frequency should be
attended to, in addition to learners' proficiency levels, when selecting
collocations to teach. Focus should be put on the most frequent words first,
namely, up to the 2000-word level, when teaching collocations. This has
support at the vocabulary size level (Nation, 2006), where it is suggested
that the 2000-word level should be explicitly taught while the other
vocabulary levels can simply be taught through reading.

The present study suffers from two major drawbacks. Firstly, the
vocabulary size of the participants was not tested by means of a standardised
vocabulary size test. This would have allowed me to know whether the
collocation test scores at a given word frequency band were low because of
individual items or because of collocations, especially at infrequent word
frequency bands. It would also have allowed the findings of the present study
to be compared with those at the vocabulary size level. Secondly, the study
did not include any qualitative analysis of the test items, which is the only
way to address the main limitations of the test construct, especially as
regards the context and the two letters provided.


Conclusion 

As discussed above, the results of the study suggest that (a) controlled
productive knowledge of collocations develops in parallel with L2 proficiency,
as the same proficiency levels distinguished by means of TOEFL are reflected
in the collocation test scores, (b) frequency plays a fundamental role in the
growth of controlled productive knowledge of collocations, as the test-takers
scored progressively better from the less frequent to the more frequent
levels, and (c) collocational knowledge growth can be quantified, where the
gains are dependent on both proficiency and word frequency levels.
The present study has achieved the set objectives, but has also posed
challenging questions worth considering in future research. The findings of the
study show that more proficient L2 learners do better and that more frequent
collocations are better mastered. However, the study did not make any
reference to the teaching approaches the participants followed, which was
practically impossible given the wide range and background of participants. A
follow-up study, more experimental in nature, associating an approach to
teaching collocations with the test scores of participants, would tell us more
about controlled productive knowledge of collocations and thereby complement
the present study, which only gave an overall indication of collocational
knowledge growth across proficiency and word frequency levels. Moreover, the
present study is semi-longitudinal, namely, the participants were specifically
selected from different learning levels (year one, year three, and year four
among the Burundian students) and more proficient participants (Belgian
students), and the question is whether or not a purely longitudinal study
would come up with the same observations. Furthermore, the test used provides
the first two letters of the word to be supplied (the collocate), which is
actually the main limitation of the test; it remains to be seen whether or not
the same test taken without the first two letters provided would lead to the
same conclusions. It would be interesting to explore this in a follow-up study.
Extending this study to other types of collocations will certainly yield
interesting results too, which will contribute towards modelling collocations
better than they are today.

In summary, the study has clearly demonstrated that collocational growth
follows proficiency levels as well as frequency of words, which lays basic
groundwork for a collocation-based syllabus. For instance, the Oxford
Collocation Dictionary for Students of English (2002) and Nation's (2006) word
frequency counts, considered in developing the collocation test used in the
study, can also be considered in selecting 'which' collocations to teach. This
kind of selection, along with the awareness-raising approaches reported in
Barfield and Gyllstad (2009) and Boers and Lindstromberg (2009), or the
cognitive-linguistics-inspired pedagogy reported in Boers and Lindstromberg
(2008), will definitely take this debate a step further, especially now that
the question of teaching collocations is much more related to what aspects to
teach and how to teach them (Granger & Meunier, 2008). It is hoped that the
study has made a considerable step in this direction. All the above studies,
though conducted in different contexts using different tasks, point to the
same observation, that raising students' awareness of collocations improves
their knowledge of collocations. My study, which has shown that moderate gains
of collocations are found at low and advanced levels while impressive gains of
collocations are found at intermediate levels, sheds some light on exactly
what collocations to teach at which level of proficiency, taking into account
both word frequency and proficiency levels.
APPENDIX A


One-way Repeated ANOVA Tables 


Table 1 Word frequency levels one-way repeated ANOVA

Source                               Type III sum   df        Mean      F         Sig.   Partial eta
                                     of squares               square                     squared
Level          Sphericity assumed    509.939        3         169.980   119.480   .000   .394
               Greenhouse-Geisser    509.939        2.874     177.451   119.480   .000   .394
               Huynh-Feldt           509.939        2.924     174.392   119.480   .000   .394
               Lower-bound           509.939        1.000     509.939   119.480   .000   .394
Error (level)  Sphericity assumed    785.311        552       1.423
               Greenhouse-Geisser    785.311        528.758   1.485
               Huynh-Feldt           785.311        538.034   1.460
               Lower-bound           785.311        184.000   4.268


Table 2 Multiple comparisons of means at word frequency levels

(I)     (J)     Mean difference   Std. error   Sig. (b)   95% Confidence interval for difference (b)
level   level   (I-J)                                     Lower bound   Upper bound
1       2         .238            .114         .229        -.066          .542
        3        2.124*           .121         .000        1.800         2.448
        4        1.038*           .115         .000         .732         1.343
2       1        -.238            .114         .229        -.542          .066
        3        1.886*           .139         .000        1.515         2.258
        4         .800*           .125         .000         .466         1.134
3       1       -2.124*           .121         .000       -2.448        -1.800
        2       -1.886*           .139         .000       -2.258        -1.515
        4       -1.086*           .128         .000       -1.428         -.745
4       1       -1.038*           .115         .000       -1.343         -.732
        2        -.800*           .125         .000       -1.134         -.466
        3        1.086*           .128         .000         .745         1.428

a. Based on estimated marginal means.

*. The mean difference is significant at the .05 level.

b. Adjustment for multiple comparisons: Bonferroni.
APPENDIX B

Correlations between TOEFL, collocation test and word frequency levels 


Table 3 Correlations

Pearson correlation   2000-word   3000-word   5000-word   AWL      COLLOTTOT   TOEFLTOT
                      level       level       level
2000-word level       1           .717**      .686**      .721**   .894**      .746**
3000-word level       .717**      1           .600**      .677**   .858**      .730**
5000-word level       .686**      .600**      1           .670**   .850**      .700**
AWL                   .721**      .677**      .670**      1        .882**      .740**
COLLOTTOT             .894**      .858**      .850**      .882**   1           .837**
TOEFLTOT              .746**      .730**      .700**      .740**   .837**      1

Note: Sig. (2-tailed) = .000 and N = 185 for every pair of variables.

**. Correlation is significant at the 0.01 level (2-tailed).
Quantifying controlled productive knowledge of collocations across proficiency and word frequency...


APPENDIX C 


Productive Collocation Test 


Productive Vocabulary Test

Name:                              Date:

Level of study (year):             Start hour:

University:                        End hour:

Instruction: Complete the underlined words in the sentences below.

Example: She is conducting campaigns to at____ new clients.

She is conducting campaigns to attract new clients.


1. I ha____ no intention of changing jobs because I am happy where I am.

2. Enemy planes were seen to dr____ bombs along the railway line.

3. They always pa____ a 10% commission on every sold encyclopaedia.

4. I wonder, this unusual building seems to barely fi____ the definition of a house.

5. Better sa____ your energy not trying to persuade people who are not interested.

6. She asked him if he could ke____ a secret before telling him the horrible story.

7. Great care is being taken to en____ the accuracy of research data with good planning, several revisions and rewrites as part of the procedure.

8. She felt she would ma____ a terrible mess of her life if she were to throw everything overboard now.

9. They did not ge____ the permit for a street demonstration against university fees they had applied for a couple of months ago.

10. Her appointment will fi____ the gap created when the marketing manager left.

11. They held celebrations to ma____ the anniversary of Mozart's death.

12. It is common practice that when a song ends, the performer has to ta____ a bow.

13. They plan to se____ congratulations to Tony on his new job and bought a nice card.

14. We could he____ a faint echo, before it slowly died away.

15. Victory will br____ glory, fame, and riches to the football team.

16. She inherited all the family precious stones, but she does not like to we____ jewellery.

17. In May and June, females leave the males to bu____ a nest and incubate their eggs.

18. She joined the navy where she expects to re____ the rank of captain before retiring.

19. He is a person who can se____ his soul to the devil provided he gets money.

20. Why didn't the referee bl____ the whistle just before he shot the goal; it would have prevented the clash between rival supporters.

21. When she got pregnant at the age of 16, she decided to ha____ an abortion.

22. The estate expects to ho____ an auction to raise money.

23. Our party should en____ diversity, not division, in order to attract new members.

24. How do you ex____ the discrepancies between the money and the receipts?

25. Jumbo jets somehow la____ the glamour of the transatlantic liner which has an impact on the number of passengers.

26. She had a short time to dress and ap____ lipstick before rushing out to the party.

27. The burglars had to br____ a pane of the front window to enter the house.

28. He vowed to ta____ revenge on the man who had killed his brother.
28. He vowed to ta. revenge on the man who had killed his brother. 
Deogratias Nizonkiza


29. They have decided to ch____ the catwalk stereotype of the skinny model.

30. They called on the government to help pro____ native wildlife as a response to the major environmental concerns of the century.

31. She was hoping she would not have to gi____ evidence in court.

32. I can't re____ any conclusions from their vague observations.

33. She had to pa____ some compensation for the damages she had caused.

34. With the new computer, you can ha____ access to all the files.

35. The mechanic can ma____ the necessary adjustments to the broken engine.

36. Many universities in the UK ch____ special fees to overseas students.

37. His sound argument will la____ the foundations for future cooperation between the two countries.

38. We have to fo____ the safety guidelines laid down by the government.

39. It is the duty of the local community to pr____ accommodation for the homeless.

40. He was found to su____ from clinical depression after several months of hospitalisation.